Abstract

Expander graphs are widely used in communication problems and construction of error 
correcting codes. In such graphs, information gets through very quickly. Typically, it is 
not true for social or biological networks, though we may find a partition of the vertices 
such that the induced subgraphs on them and the bipartite subgraphs between any pair 
of them exhibit regular behavior of information flow within or between the vertex subsets.
Implications between spectral and regularity properties are discussed.

Keywords: Spectral gap; Spectral clustering; Volume regularity.

1 Introduction

We want to go beyond the expander graphs that - for four decades - have played an important
role in communication networks; for a summary, see e.g., Chung [8] and Hoory et al. [14].
Roughly speaking, the expansion property means that each subset of the graph's vertices has
"many" neighbors (combinatorial view), and hence, information gets through such a graph very
"quickly" (probabilistic view). We will not give exact definitions of expanders here as those
contain many parameters which are not used later. We rather refer to the spectral and random
walk characterization of such graphs, as discussed, among others, by Alon [1], and Meila and
Shi [17].

The general framework of an edge-weighted graph will be used. Expanders have a spectral
gap bounded away from zero, where - for a connected graph - this gap is defined as the
minimum distance between the normalized Laplacian spectrum (apart from the trivial zero
eigenvalue) and the endpoints of the [0,2] interval, the possible range of the spectrum. The
larger the spectral gap, the more our graph resembles a random graph and exhibits quasi-random
properties, e.g., the edge densities within any subset and between any two subsets of
its vertices do not differ too much from what is expected, see the Expander Mixing Lemma
(Lemma 1) of Section 2. Quasi-random properties and the spectral gap of random graphs with
given expected degrees are discussed in Chung and Graham [9], and Coja-Oghlan and Lanka [11].
However, the spectral gap does not appear at the ends of the normalized Laplacian spectrum in
the case of generalized random or generalized quasi-random graphs that, in the presence of
$k > 2$ underlying clusters, have $k$ eigenvalues (including the zero) separated from 1, while
the bulk of the spectrum is located around 1, see e.g., [6]. These structures are usual in
social or biological networks having $k$ clusters of vertices (that belong to social groups or
similarly functioning enzymes) such that the edge density within the clusters and between any
pair of the clusters is homogeneous.

Our conjecture is that $k$ so-called structural eigenvalues (separated from 1) in the normalized
Laplacian spectrum are indications of such a structure, while the near 1 eigenvalues are
responsible for the pairwise regularities. The clusters themselves can be recovered by applying
the $k$-means algorithm for the vertex representatives obtained by the eigenvectors
corresponding to the structural eigenvalues (apart from the zero). For the $k = 2$ case we will
give an exact relation between the eigenvalue separation (of the non-trivial structural
eigenvalue from the bulk of the spectrum) and the volume regularity of the cluster pair that is
obtained by the $k$-means algorithm applied for the coordinates of the transformed eigenvector
belonging to the non-trivial structural eigenvalue, see Theorem 1 of Section 3. To eliminate the
trivial eigenvalue-eigenvector pair, we shall rather use the normalized modularity spectrum of
[7] that plays an important role in finding the extrema of some penalized versions of the
Newman-Girvan modularity introduced in [18]. Theorem 2 of Section 4 gives an estimation for the
extent of volume-regularity of the different cluster pairs in the $k > 2$ case based on the
spectral gap and the $k$-variance of the vertex representatives.

In [10, 16], the authors give algorithms - based on low rank approximation - to find a regular
partition if $k$ is known and our graph comes from a generalized random graph model with
$k$ clusters. Without knowing $k$, there are constructions - like [13] - based on refinement of
partitions and leading to a very fine partition with the number of clusters depending merely on
the constant ruling the regularity of the cluster pairs. On the contrary, our purpose is to
estimate the extent of the regularity of the cluster pairs by means of spectral gaps and
eigenvectors. The estimations given are relevant only in the presence of a large spectral gap
(between some structural and the other eigenvalues) and special classification properties of the
eigenvectors corresponding to the structural eigenvalues, see Theorem 2 of Section 4. In this
case, the algorithm is straightforward via $k$-means clustering.



2 Preliminaries and statement of purpose 

Let $G = (V, W)$ be a graph on $n$ vertices, where the $n \times n$ symmetric matrix $W$ has
non-negative real entries and zero diagonal. Here $w_{ij}$ is the similarity between vertices
$i$ and $j$, where 0 similarity means no connection/edge at all. A simple graph is a special
case of it with 0-1 weights. Without loss of generality

$$\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} = 1 \tag{1}$$

will be supposed. Hence, $W$ is a joint distribution, with marginal entries

$$d_i = \sum_{j=1}^{n} w_{ij}, \qquad i = 1, \dots, n,$$

which are the generalized vertex degrees collected in the main diagonal of the diagonal degree
matrix $D = \mathrm{diag}(\mathbf{d})$, $\mathbf{d} = (d_1, \dots, d_n)^T$. In [4, 5] we
investigated the spectral gap of the normalized Laplacian $L_D = I - D^{-1/2} W D^{-1/2}$,
where $I$ denotes the identity matrix of appropriate size.
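
The definitions above can be tried out numerically. The following is a minimal sketch, assuming
NumPy and a small hypothetical example graph (a path on 4 vertices); the function name
`normalized_laplacian` is an illustrative choice, not notation from the cited papers.

```python
import numpy as np

def normalized_laplacian(W):
    """Return (d, L_D) for a symmetric, non-negative weight matrix W.

    W is rescaled so that its entries sum to 1, as in normalization (1).
    Assumes there are no isolated vertices (all degrees positive).
    """
    W = np.asarray(W, dtype=float)
    W = W / W.sum()                       # total weight 1
    d = W.sum(axis=1)                     # generalized degrees d_i
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return d, L

# Hypothetical example: a path on 4 vertices with unit weights.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d, L = normalized_laplacian(W)
lam = np.linalg.eigvalsh(L)               # eigenvalues in ascending order
```

Here `lam[0]` is the trivial zero eigenvalue and all eigenvalues lie in the [0,2] interval.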

Suppose that our graph is connected ($W$ is irreducible). Let
$0 = \lambda_1 < \lambda_2 \le \dots \le \lambda_n \le 2$ denote the eigenvalues of the symmetric
normalized Laplacian $L_D$ with corresponding unit-norm, pairwise orthogonal eigenvectors
$\mathbf{u}_1, \dots, \mathbf{u}_n$. Namely,
$\mathbf{u}_1 = (\sqrt{d_1}, \dots, \sqrt{d_n})^T = \sqrt{\mathbf{d}}$. In the random walk setup
$D^{-1} W$ is the transition matrix (its entry in the $(i, j)$-th position is the conditional
probability of moving from vertex $i$ to vertex $j$ in one step, given that we are in $i$),
which is a stochastic matrix with eigenvalues $1 - \lambda_i$ and corresponding eigenvectors
$D^{-1/2} \mathbf{u}_i$ ($i = 1, \dots, n$). "Good" expanders have a $\lambda_2$ bounded away
from zero, that also implies the separation of the isoperimetric number

$$h(G) = \min_{U \subset V :\ \mathrm{Vol}(U) \le 1/2} \frac{w(U, \overline{U})}{\mathrm{Vol}(U)}, \tag{2}$$

where for $X, Y \subset V$: $w(X, Y) = \sum_{i \in X} \sum_{j \in Y} w_{ij}$ is the weighted cut
between $X$ and $Y$, while $\mathrm{Vol}(U) = \sum_{i \in U} d_i$ is the volume of
$U \subset V$. In view of (1), $\mathrm{Vol}(V) = 1$; this is why the minimum is taken on vertex
sets having volume at most $1/2$. In [5], we proved that

$$\frac{\lambda_2}{2} \le h(G) \le \min\{1, \sqrt{2 \lambda_2}\}, \tag{3}$$
while in the $\lambda_2 \le 1$ case the stronger upper estimation

$$h(G) \le \sqrt{\lambda_2 (2 - \lambda_2)}$$

holds. (We remark that $\lambda_2 \le \frac{n}{n-1}$ always holds.)
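
For a small graph, the isoperimetric number (2) can be found by exhaustive search and compared
with the Cheeger-type bounds. The sketch below assumes NumPy and a hypothetical example graph
(two triangles joined by one edge); the brute-force search is exponential in $n$ and is meant
only as an illustration.

```python
import itertools
import numpy as np

def isoperimetric_number(W):
    """Brute-force h(G): minimum of w(U, complement)/Vol(U) over Vol(U) <= 1/2."""
    W = W / W.sum()
    d = W.sum(axis=1)
    n = len(d)
    best = np.inf
    for r in range(1, n):
        for U in itertools.combinations(range(n), r):
            U = list(U)
            vol = d[U].sum()
            if vol <= 0.5 + 1e-12:            # tolerance for rounding at Vol = 1/2
                comp = [i for i in range(n) if i not in U]
                cut = W[np.ix_(U, comp)].sum()
                best = min(best, cut / vol)
    return best

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
W = np.zeros((6, 6))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0

h = isoperimetric_number(W)
Wn = W / W.sum()
s = np.sqrt(Wn.sum(axis=1))
lam = np.linalg.eigvalsh(np.eye(6) - (Wn / s[:, None]) / s[None, :])
lam2 = lam[1]
```

In this example the minimum is attained by cutting the single edge between the two triangles.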

If a network does not have a "large" $\lambda_2$ (compared to the natural lower bound), or
equivalently - in view of the above inequalities - it has a relatively "small" isoperimetric
number, then the 2-partition of the vertices giving the minimum in (2) indicates a bottleneck,
or equivalently, a low conductivity edge-set between two disjoint vertex clusters such that the
random walk gets through with small probability between them, but - as some equivalent notions
will indicate - it is rapidly mixing within the clusters. To find the clusters, the coordinates
of the transformed eigenvector $D^{-1/2} \mathbf{u}_2$ will be used. In [4], we proved that for
the weighted 2-variance of this vector's coordinates

$$S_2^2(D^{-1/2} \mathbf{u}_2) \le \frac{\lambda_2}{\lambda_3} \tag{4}$$

holds. For a general $2 < k < n$, the notion of $k$-variance - in the Analysis of Variance
sense - is the following. The weighted $k$-variance of the $k$-dimensional vertex
representatives $\mathbf{x}_1, \dots, \mathbf{x}_n$ comprising the row vectors of the
$n \times k$ matrix $X$ is defined by

$$S_k^2(X) = \min_{P_k \in \mathcal{P}_k} S_k^2(P_k, X) = \min_{P_k = (V_1, \dots, V_k)} \sum_{a=1}^{k} \sum_{j \in V_a} d_j \|\mathbf{x}_j - \mathbf{c}_a\|^2, \tag{5}$$

where $\mathbf{c}_a = \frac{1}{\mathrm{Vol}(V_a)} \sum_{j \in V_a} d_j \mathbf{x}_j$ is the
weighted center of cluster $V_a$ ($a = 1, \dots, k$) and $\mathcal{P}_k$ denotes the set of
$k$-partitions of the vertices. We remark that
$S_2^2(D^{-1/2} \mathbf{u}_1, D^{-1/2} \mathbf{u}_2) = S_2^2(D^{-1/2} \mathbf{u}_2)$,
since $D^{-1/2} \mathbf{u}_1 = \mathbf{1}$ is the all 1's vector.
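
The weighted $k$-variance of (5) is easy to evaluate for a fixed partition, and for very small
graphs the minimum over all 2-partitions can be found by enumeration. The following sketch
assumes NumPy; the helper names `k_variance` and `min_2_variance` and the example graph (two
triangles joined by an edge) are illustrative assumptions.

```python
import itertools
import numpy as np

def k_variance(X, d, labels):
    """Weighted k-variance S_k^2(P_k, X) of the rows of X for a fixed partition."""
    X = np.asarray(X, dtype=float).reshape(len(d), -1)
    total = 0.0
    for a in np.unique(labels):
        idx = labels == a
        center = (d[idx, None] * X[idx]).sum(axis=0) / d[idx].sum()
        total += (d[idx] * ((X[idx] - center) ** 2).sum(axis=1)).sum()
    return total

def min_2_variance(X, d):
    """Brute-force minimum over all 2-partitions (feasible only for small n)."""
    n = len(d)
    best, best_labels = np.inf, None
    for r in range(1, n // 2 + 1):
        for U in itertools.combinations(range(n), r):
            labels = np.zeros(n, dtype=int)
            labels[list(U)] = 1
            s2 = k_variance(X, d, labels)
            if s2 < best:
                best, best_labels = s2, labels
    return best, best_labels

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
W = np.zeros((6, 6))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0
W /= W.sum()
d = W.sum(axis=1)
s = np.sqrt(d)
lam, U = np.linalg.eigh(np.eye(6) - (W / s[:, None]) / s[None, :])
x = U[:, 1] / s                                   # coordinates of D^{-1/2} u_2

s2_min, labels = min_2_variance(x, d)
# Adding the constant column D^{-1/2} u_1 = 1 does not change the 2-variance.
s2_with_ones, _ = min_2_variance(np.column_stack([np.ones(6), x]), d)
```

The last two lines illustrate the remark that the trivial all 1's coordinate can be omitted.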

The above results were generalized for minimizing the normalized $k$-way cut

$$f_k(P_k, G) = \sum_{a=1}^{k-1} \sum_{b=a+1}^{k} \left( \frac{1}{\mathrm{Vol}(V_a)} + \frac{1}{\mathrm{Vol}(V_b)} \right) w(V_a, V_b) = k - \sum_{a=1}^{k} \frac{w(V_a, V_a)}{\mathrm{Vol}(V_a)} \tag{6}$$

of the $k$-partition $P_k = (V_1, \dots, V_k)$ over the set of all possible $k$-partitions. Let

$$f_k(G) = \min_{P_k \in \mathcal{P}_k} f_k(P_k, G)$$

be the minimum normalized $k$-way cut of the underlying weighted graph $G = (V, W)$. In fact,
$f_2(G)$ is the symmetric version of the isoperimetric number and $f_2(G) \le 2 h(G)$. In [5] we
proved that

$$\sum_{i=1}^{k} \lambda_i \le f_k(G) \le c \sum_{i=1}^{k} \lambda_i, \tag{7}$$

where the upper estimation is relevant only in the case when the minimum $k$-variance of the
vertex representatives (based on $\mathbf{u}_1, \dots, \mathbf{u}_k$) is small enough
and the constant $c$ depends on this minimum $k$-variance of the vertex representatives.
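
As a small numerical illustration of the normalized $k$-way cut, the sketch below evaluates both
forms of $f_k(P_k, G)$ for a fixed partition and compares the result with the sum of the $k$
smallest normalized Laplacian eigenvalues, which bounds $f_k(P_k, G)$ from below for every
$k$-partition. NumPy, the example weights and the partition are assumptions chosen for the
illustration.

```python
import numpy as np

def normalized_cut(W, labels):
    """Return f_k(P_k, G) computed pairwise and via the within-cluster form."""
    W = W / W.sum()
    d = W.sum(axis=1)
    clusters = [np.flatnonzero(labels == a) for a in np.unique(labels)]
    vol = [d[c].sum() for c in clusters]
    k = len(clusters)
    pairwise = sum((1.0 / vol[a] + 1.0 / vol[b]) * W[np.ix_(clusters[a], clusters[b])].sum()
                   for a in range(k) for b in range(a + 1, k))
    within = k - sum(W[np.ix_(c, c)].sum() / v for c, v in zip(clusters, vol))
    return pairwise, within

# Hypothetical weights: w_ij = 1 / (1 + |i - j|) for i != j, zero diagonal.
n, k = 6, 3
idx = np.arange(n)
W = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))
np.fill_diagonal(W, 0.0)
labels = np.array([0, 0, 1, 1, 2, 2])

f_pairwise, f_within = normalized_cut(W, labels)

Wn = W / W.sum()
s = np.sqrt(Wn.sum(axis=1))
lam = np.linalg.eigvalsh(np.eye(n) - (Wn / s[:, None]) / s[None, :])
lower_bound = lam[:k].sum()          # sum of the k smallest eigenvalues
```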
The normalized Newman-Girvan modularity is defined in [7] as the penalized version of the
Newman-Girvan modularity [18] in the following way. The normalized $k$-way modularity of
$P_k = (V_1, \dots, V_k)$ is

$$Q_k(P_k, G) = \sum_{a=1}^{k} \frac{1}{\mathrm{Vol}(V_a)} \left[ w(V_a, V_a) - \mathrm{Vol}^2(V_a) \right] = k - 1 - f_k(P_k, G), \tag{8}$$

and

$$Q_k(G) = \max_{P_k \in \mathcal{P}_k} Q_k(P_k, G)$$

is the maximum normalized $k$-way Newman-Girvan modularity of the underlying weighted graph
$G = (V, W)$. For given $k$, maximizing this modularity is equivalent to minimizing the
normalized cut and can be solved by the same spectral technique. In fact, it is more convenient
to use the spectral decomposition of the normalized modularity matrix
$B_D = I - L_D - \sqrt{\mathbf{d}} \sqrt{\mathbf{d}}^T$ with eigenvalues
$\beta_1 \ge \dots \ge \beta_n$, that are the numbers $1 - \lambda_i$ with eigenvectors
$\mathbf{u}_i$ ($i = 2, \dots, n$) and the zero with corresponding unit-norm eigenvector
$\sqrt{\mathbf{d}}$. In [5, 7], we also show that a spectral gap between $\lambda_k$ and
$\lambda_{k+1}$ is an indication of $k$ clusters with low inter-cluster connections; further,
the intra-cluster connections ($w_{ij}$) between vertices $i$ and $j$ of the same cluster
are higher than expected under the hypothesis of independence (in view of which the vertices
are connected with probability $d_i d_j$). In the random walk framework, the random walk stays
within the clusters with high probability.
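
The relation between the normalized modularity matrix and the normalized Laplacian, as well as
identity (8), can be checked numerically. This is a sketch under stated assumptions: NumPy, a
hypothetical weight matrix and a hypothetical 3-partition; the function names are illustrative.

```python
import numpy as np

def modularity_matrix(W):
    """Return normalized W, degrees d and B_D = D^{-1/2} W D^{-1/2} - sqrt(d) sqrt(d)^T."""
    W = W / W.sum()
    d = W.sum(axis=1)
    s = np.sqrt(d)
    M = (W / s[:, None]) / s[None, :]
    return W, d, M - np.outer(s, s)

def modularity_and_cut(W, d, labels):
    """Normalized k-way modularity Q_k and normalized cut f_k of a fixed partition."""
    clusters = [np.flatnonzero(labels == a) for a in np.unique(labels)]
    vol = [d[c].sum() for c in clusters]
    k = len(clusters)
    Q = sum((W[np.ix_(c, c)].sum() - v ** 2) / v for c, v in zip(clusters, vol))
    f = sum((1.0 / vol[a] + 1.0 / vol[b]) * W[np.ix_(clusters[a], clusters[b])].sum()
            for a in range(k) for b in range(a + 1, k))
    return Q, f, k

# Hypothetical weights: w_ij = 1 / (1 + |i - j|) for i != j, zero diagonal.
n = 6
idx = np.arange(n)
W0 = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))
np.fill_diagonal(W0, 0.0)

W, d, B = modularity_matrix(W0)
beta = np.linalg.eigvalsh(B)
s = np.sqrt(d)
lam = np.linalg.eigvalsh(np.eye(n) - (W / s[:, None]) / s[None, :])
Q, f, k = modularity_and_cut(W, d, np.array([0, 0, 1, 1, 2, 2]))
```

The spectrum `beta` consists of the numbers `1 - lam[i]` for the non-trivial eigenvalues
together with a zero belonging to `sqrt(d)`.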

Conversely, minimizing the above modularity will result in clusters with high inter- and low
intra-cluster connections. In [7], we proved that

$$\min_{P_k \in \mathcal{P}_k} Q_k(P_k, G) \ge \sum_{i=1}^{k} \beta_{n+1-i}. \tag{9}$$

The existence of $k$ "large" (significantly larger than 1) eigenvalues in the normalized
Laplacian spectrum, or equivalently, the existence of $k$ negative eigenvalues (separated from
0) in the normalized modularity spectrum is an indication of $k$ clusters with the above
property. In the random walk setup: the walk stays within the clusters with low probability.
These two types of network structures are frequently called community or anti-community
structure. These are the two extreme cases, when $f_k(P_k, G)$ is either minimized or maximized,
and the optimization gives $k$ clusters with either strong intra-cluster and weak inter-cluster
connections, or vice versa. Some networks exhibit a more general, still regular behavior: the
vertices can be classified into $k$ clusters such that the information-flow within them and
between any pair of them is homogeneous. In terms of random walks, the walk stays within
clusters or switches between clusters with probabilities characteristic for the cluster pair.
That is, if the random walk moves from a vertex of cluster $V_a$ to a vertex of cluster $V_b$,
then the probability of doing this does not depend on the actual vertices, it merely depends on
their cluster memberships, $a, b = 1, \dots, k$.

In this context, we examined the following generalized random graph model, that corresponds
to the ideal case: given the number of clusters $k$, the vertices of the graph independently
belong to the clusters; further, conditioned on the cluster memberships, vertices $i \in V_a$
and $j \in V_b$ are connected with probability $p_{ab}$, independently of each other,
$1 \le a, b \le k$. Applying the results of [6] for the spectral characterization of some noisy
random graphs, we are able to prove that the normalized modularity spectrum of a generalized
random graph is the following: there exists a positive number $\theta < 1$, independent of $n$,
such that there are exactly $k - 1$ so-called structural eigenvalues of $B_D$ that are greater
than $\theta - o(1)$, while all the others are $o(1)$ in absolute value. It is equivalent that
$L_D$ has $k$ eigenvalues (including the zero) separated from 1.
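
The eigenvalue separation in the generalized random graph model can be observed on a simulated
example. The sketch below assumes NumPy, fixed (rather than random) cluster sizes, a
hypothetical probability matrix `P`, and an ad hoc threshold of 0.35 for separating the
structural eigenvalues from the bulk; none of these values come from the cited results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed parameters: k = 3 clusters of 100 vertices, dense within, sparse between.
sizes = [100, 100, 100]
P = np.array([[0.5, 0.1, 0.1],
              [0.1, 0.5, 0.1],
              [0.1, 0.1, 0.5]])
labels = np.repeat(np.arange(3), sizes)
probs = P[labels][:, labels]                      # n x n matrix of edge probabilities

# Sample the upper triangle independently and symmetrize (no loops).
upper = np.triu(rng.random(probs.shape) < probs, k=1)
A = (upper | upper.T).astype(float)

W = A / A.sum()
d = W.sum(axis=1)
s = np.sqrt(d)
B = (W / s[:, None]) / s[None, :] - np.outer(s, s)   # normalized modularity matrix
beta = np.linalg.eigvalsh(B)

structural = beta[np.abs(beta) > 0.35]            # eigenvalues separated from the bulk
```

With these parameters one expects $k - 1 = 2$ positive structural eigenvalues, while the
remaining eigenvalues stay close to zero.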

The k = 1 case corresponds to quasi-random graphs and the above characterization corresponds 
to the eigenvalue separation of such graphs, discussed in [9]. The authors also prove some 
implications between the so-called quasi-random properties. For example, for dense graphs, 
"good" eigenvalue separation is equivalent to "low" discrepancy (of the induced subgraphs' 
densities from the overall edge density). 

For the $k > 2$ case, generalized quasi-random graphs were introduced by Lovász and T. Sós [15].
These graphs are deterministic counterparts of generalized random graphs with the same spectral
properties. In fact, the authors define so-called generalized quasi-random graph sequences
by means of graph convergence that also implies the convergence of spectra. Though the spectrum
itself does not carry enough information for the cluster structure of the graph, together
with some classification properties of the structural eigenvectors it does. We want to prove
some implication between the spectral gap and the volume-regularity of the cluster pairs, also
using the structural eigenvectors.

The notion of volume regularity was introduced by Alon et al. [2]. We shall use a slightly
modified version of this notion.

Definition 1 Let $G = (V, W)$ be a weighted graph with $\mathrm{Vol}(V) = 1$. The disjoint pair
$(A, B)$ is $\alpha$-volume regular if for all $X \subset A$, $Y \subset B$ we have

$$|w(X, Y) - \rho(A, B) \mathrm{Vol}(X) \mathrm{Vol}(Y)| \le \alpha \sqrt{\mathrm{Vol}(A) \mathrm{Vol}(B)}, \tag{10}$$

where $\rho(A, B) = \frac{w(A, B)}{\mathrm{Vol}(A) \mathrm{Vol}(B)}$ is the relative
inter-cluster density of $(A, B)$.
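
For very small graphs, the smallest $\alpha$ satisfying (10) for a given pair can be computed by
enumerating all subset pairs. The following is a sketch only, assuming NumPy; the function name
`regularity_constant` and the two example graphs (a complete bipartite graph and two triangles
joined by an edge) are illustrative assumptions.

```python
import itertools
import numpy as np

def subsets(items):
    """All non-empty subsets of a list of vertex indices."""
    for r in range(1, len(items) + 1):
        for c in itertools.combinations(items, r):
            yield list(c)

def regularity_constant(W, A, B):
    """Smallest alpha for which the disjoint pair (A, B) is alpha-volume regular."""
    W = W / W.sum()
    d = W.sum(axis=1)
    vol_a, vol_b = d[A].sum(), d[B].sum()
    rho = W[np.ix_(A, B)].sum() / (vol_a * vol_b)      # relative inter-cluster density
    worst = 0.0
    for X in subsets(A):
        for Y in subsets(B):
            dev = abs(W[np.ix_(X, Y)].sum() - rho * d[X].sum() * d[Y].sum())
            worst = max(worst, dev)
    return worst / np.sqrt(vol_a * vol_b)

A, B = [0, 1, 2], [3, 4, 5]

# Complete bipartite graph K_{3,3}: the pair is perfectly regular.
K33 = np.zeros((6, 6))
K33[np.ix_(A, B)] = 1.0
K33[np.ix_(B, A)] = 1.0
alpha_bipartite = regularity_constant(K33, A, B)

# Two triangles joined by the single edge (2,3): the pair is far from regular.
T = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    T[i, j] = T[j, i] = 1.0
alpha_triangles = regularity_constant(T, A, B)
```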

Our definition was inspired by the Expander Mixing Lemma stated e.g., in [14] for regular graphs
and in [8] for simple graphs in the context of quasi-random properties. Now we formulate it for
edge-weighted graphs on a general degree sequence. We also include the proof as a preparation
for the proof of Theorem 1 of Section 3.

Lemma 1 (Expander Mixing Lemma for Weighted Graphs) Let $G = (V, W)$ be a
weighted graph and suppose that $\mathrm{Vol}(V) = 1$. Then for all $X, Y \subset V$:

$$|w(X, Y) - \mathrm{Vol}(X) \mathrm{Vol}(Y)| \le \|B_D\| \cdot \sqrt{\mathrm{Vol}(X) (1 - \mathrm{Vol}(X)) \mathrm{Vol}(Y) (1 - \mathrm{Vol}(Y))} \le \|B_D\| \cdot \sqrt{\mathrm{Vol}(X) \mathrm{Vol}(Y)},$$

where $\|B_D\|$ is the spectral norm of the normalized modularity matrix of $G$.

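Proof Let $X, Y \subset V$ and let $\mathbf{1}_U \in \mathbb{R}^n$ denote the indicator vector
of $U \subset V$. Further, $\mathbf{x} := D^{1/2} \mathbf{1}_X$ and
$\mathbf{y} := D^{1/2} \mathbf{1}_Y$.

We use the spectral decomposition
$D^{-1/2} W D^{-1/2} = \sum_{i=1}^{n} \rho_i \mathbf{u}_i \mathbf{u}_i^T$, where
$\rho_i = 1 - \lambda_i$ ($i = 2, \dots, n$) are eigenvalues of $B_D$ and $\rho_1 = 1$ with
corresponding unit-norm eigenvector
$\mathbf{u}_1 = \sqrt{\mathbf{d}} = (\sqrt{d_1}, \dots, \sqrt{d_n})^T$. We remark that
$\mathbf{u}_1$ is also an eigenvector of $B_D$ corresponding to the eigenvalue zero, hence
$\|B_D\| = \max_{i \ge 2} |\rho_i|$. Let $\mathbf{x} = \sum_{i=1}^{n} x_i \mathbf{u}_i$ and
$\mathbf{y} = \sum_{i=1}^{n} y_i \mathbf{u}_i$ be the expansions of $\mathbf{x}$ and
$\mathbf{y}$ in the orthonormal basis $\mathbf{u}_1, \dots, \mathbf{u}_n$ with coordinates
$x_i = \mathbf{x}^T \mathbf{u}_i$ and $y_i = \mathbf{y}^T \mathbf{u}_i$, respectively. Observe
that $x_1 = \mathrm{Vol}(X)$, $y_1 = \mathrm{Vol}(Y)$ and
$\sum_{i=1}^{n} x_i^2 = \|\mathbf{x}\|^2 = \mathrm{Vol}(X)$,
$\sum_{i=1}^{n} y_i^2 = \|\mathbf{y}\|^2 = \mathrm{Vol}(Y)$. Based on these,

$$|w(X, Y) - \mathrm{Vol}(X) \mathrm{Vol}(Y)| = \Big| \sum_{i=2}^{n} \rho_i x_i y_i \Big| \le \|B_D\| \cdot \sum_{i=2}^{n} |x_i y_i|$$

$$\le \|B_D\| \cdot \sqrt{\sum_{i=2}^{n} x_i^2 \sum_{i=2}^{n} y_i^2}$$

$$= \|B_D\| \cdot \sqrt{\mathrm{Vol}(X) (1 - \mathrm{Vol}(X)) \mathrm{Vol}(Y) (1 - \mathrm{Vol}(Y))}$$

$$\le \|B_D\| \cdot \sqrt{\mathrm{Vol}(X) \mathrm{Vol}(Y)},$$

where we also used the triangle and the Cauchy-Schwarz inequalities. □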
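
The inequality of Lemma 1 can be sanity-checked on random subsets of a small weighted graph.
The sketch below assumes NumPy and a hypothetical random symmetric weight matrix; it samples
subset pairs rather than enumerating them, so it illustrates the bound and does not verify it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical weighted graph: random symmetric weights with zero diagonal.
n = 12
M = rng.random((n, n))
W = np.triu(M, 1)
W = W + W.T
W /= W.sum()                                   # Vol(V) = 1
d = W.sum(axis=1)
s = np.sqrt(d)
B = (W / s[:, None]) / s[None, :] - np.outer(s, s)
norm_B = np.linalg.norm(B, 2)                  # spectral norm of B_D

violations = 0
for _ in range(2000):
    X = rng.random(n) < 0.5                    # random boolean masks for X and Y
    Y = rng.random(n) < 0.5
    vx, vy = d[X].sum(), d[Y].sum()
    lhs = abs(W[np.ix_(X, Y)].sum() - vx * vy)
    # max(..., 0) guards against tiny negative values caused by rounding
    rhs = norm_B * np.sqrt(max(vx * (1 - vx) * vy * (1 - vy), 0.0))
    violations += int(lhs > rhs + 1e-12)
```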

We remark that the spectral gap of $G$ is $1 - \|B_D\|$, hence - in view of Lemma 1 - the
density between any two subsets of "good" expanders is near to what is expected. On the
contrary, in the above definition of volume regularity, the $X, Y$ pairs are disjoint, and a
"small" $\alpha$ indicates that the $(A, B)$ pair is like a bipartite expander, see e.g., [8].

In the next section we shall prove the following statement for the $k = 2$ case: if one
eigenvalue jumps out of the bulk of the normalized modularity spectrum, then clustering the
coordinates of the corresponding transformed eigenvector into 2 parts (by minimizing the
2-variance of its coordinates) will result in an $\alpha$-volume regular partition of the
vertices, where $\alpha$ depends on the spectral gap.

We may go further: if $k - 1$ (so-called structural) eigenvalues jump out of the normalized
modularity spectrum, then clustering the representatives of the vertices - obtained by the
corresponding eigenvectors in the usual way - into $k$ clusters will result in $\alpha$-volume
regular pairs, where $\alpha$ depends on the spectral gap (between the structural eigenvalues
and the bulk of the spectrum) and the $k$-variance of the vertex representatives based on the
eigenvectors corresponding to the structural eigenvalues. In Section 4, we give an estimation
for $\alpha$ in the $k > 2$ case; further, we extend the estimation to the clusters themselves.



3 Eigenvalue separation and volume regularity (k=2 case) 

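Theorem 1 Let $G = (V, W)$ be an edge-weighted graph on $n$ vertices, with generalized degrees
$d_1, \dots, d_n$ and $D = \mathrm{diag}(d_1, \dots, d_n)$. Suppose that $\mathrm{Vol}(V) = 1$.
Let the eigenvalues of $D^{-1/2} W D^{-1/2}$, enumerated in decreasing absolute values, be

$$1 = \rho_1 > |\rho_2| = \theta > \varepsilon \ge |\rho_i|, \qquad i \ge 3.$$

The partition $(A, B)$ of $V$ is defined so that it minimizes the weighted 2-variance of the
coordinates of $D^{-1/2} \mathbf{u}_2$, where $\mathbf{u}_2$ is the unit-norm eigenvector
belonging to $\rho_2$. Then the $(A, B)$ pair is
$O\big(\sqrt{\tfrac{1-\theta}{1-\varepsilon}}\big)$-volume regular.

Proof We use the notations of Lemma 1's proof. Let $X \subset A$, $Y \subset B$. For short,
$\mathbf{x} := D^{1/2} \mathbf{1}_X$, $\mathbf{y} := D^{1/2} \mathbf{1}_Y$,
$\mathbf{a} := D^{1/2} \mathbf{1}_A$, $\mathbf{b} := D^{1/2} \mathbf{1}_B$. With
$\rho := \rho(A, B)$ and $M := W - \rho \mathbf{d} \mathbf{d}^T$,

$$|w(X, Y) - \rho \mathrm{Vol}(X) \mathrm{Vol}(Y)| = |\mathbf{1}_X^T M \mathbf{1}_Y| = |\mathbf{x}^T (D^{-1/2} W D^{-1/2} - \rho \sqrt{\mathbf{d}} \sqrt{\mathbf{d}}^T) \mathbf{y}|. \tag{11}$$

Using the spectral decomposition
$D^{-1/2} W D^{-1/2} = \sum_{i=1}^{n} \rho_i \mathbf{u}_i \mathbf{u}_i^T$ and the fact that
$\mathbf{u}_1 = \sqrt{\mathbf{d}}$, we can write (11) as

$$\Big| (1 - \rho) x_1 y_1 + \rho_2 x_2 y_2 + \sum_{i=3}^{n} \rho_i x_i y_i \Big|, \tag{12}$$

where $\mathbf{x} = \sum_{i=1}^{n} x_i \mathbf{u}_i$ and
$\mathbf{y} = \sum_{i=1}^{n} y_i \mathbf{u}_i$ is the expansion of $\mathbf{x}$ and $\mathbf{y}$
in the orthonormal basis $\mathbf{u}_1, \dots, \mathbf{u}_n$ with coordinates
$x_i = \mathbf{x}^T \mathbf{u}_i$ and $y_i = \mathbf{y}^T \mathbf{u}_i$, respectively.

First we will prove that $1 - \rho$ is governed by $\rho_2$; more precisely,
$|1 - \rho| \le |\rho_2| + \varepsilon$. Applying the arguments of Lemma 1 and the above
formulas for the special $A, B \subset V$ yields

$$\mathrm{Vol}(A) \mathrm{Vol}(B) \cdot (\rho - 1) = w(A, B) - \mathrm{Vol}(A) \mathrm{Vol}(B) = \mathbf{a}^T (D^{-1/2} W D^{-1/2} - \sqrt{\mathbf{d}} \sqrt{\mathbf{d}}^T) \mathbf{b} = \rho_2 a_2 b_2 + \sum_{i=3}^{n} \rho_i a_i b_i, \tag{13}$$

where $\mathbf{a} = \sum_{i=1}^{n} a_i \mathbf{u}_i$ and
$\mathbf{b} = \sum_{i=1}^{n} b_i \mathbf{u}_i$ is the expansion of $\mathbf{a}$ and $\mathbf{b}$
in the orthonormal basis $\mathbf{u}_1, \dots, \mathbf{u}_n$, respectively. The separation of
$A$ and $B$ is based on the vector $D^{-1/2} \mathbf{u}_2$ which has both negative and positive
coordinates, since $\mathbf{u}_2$ is orthogonal to $\mathbf{u}_1$ of all positive coordinates.
With formulas, $\mathbf{a} + \mathbf{b} = \mathbf{u}_1$, and hence,
$a_2 + b_2 = \mathbf{u}_1^T \mathbf{u}_2 = 0$. (If it is the eigenvalue $\lambda_2$ of the
normalized Laplacian that is the farthest from 1, then the corresponding eigenvector, our
$\mathbf{u}_2$, is also called "Fiedler-vector" as the two-partition of the vertices into two
loosely connected parts was based on the signs of its coordinates in the early paper of
Fiedler [12]). If $\theta$ is much larger than $\varepsilon$, the first term in the last formula
of (13) - apart from a term of $O(\varepsilon)$ - will dominate the sign of $\rho - 1$ which is
therefore opposite to the sign of $\rho_2$.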
Therefore, we will distinguish between two cases.

• If $\lambda_2 < 1 - \varepsilon$, then $\rho_2 = 1 - \lambda_2 > \varepsilon > 0$, and in view
of the inequalities between the minimum normalized cut and the smallest positive normalized
Laplacian eigenvalue (apply (7) for the $k = 2$ case):

$$\rho \ge f_2(G) = \min_{U \subset V} \frac{w(U, \overline{U})}{\mathrm{Vol}(U) \mathrm{Vol}(\overline{U})} \ge \lambda_2 = 1 - \rho_2, \tag{14}$$

therefore $1 - \rho \le \rho_2$, as $1 - \rho$ is also positive due to the considerations
before. Further, the estimation, due to (4),

$$S_2^2(D^{-1/2} \mathbf{u}_2) \le \frac{\lambda_2}{\lambda_3} \le \frac{1 - \rho_2}{1 - \varepsilon} = \frac{1 - \theta}{1 - \varepsilon} \tag{15}$$

also follows.

• If $1 - \varepsilon \le \lambda_2 \le 1$ then - provided $\varepsilon < \theta$ - it is the
eigenvalue $\lambda_n$ that is the farthest from 1, and hence, greater than $1 + \varepsilon$.
Consequently, $\rho_2 = 1 - \lambda_n < -\varepsilon < 0$, and hence, by (8) and (9):

$$\rho_2 + \rho_{neg} \le Q_2((A, B), W) = (2 - 1) - f_2((A, B), W) = 1 - \rho,$$

where $\rho_{neg} = \min\{1 - \lambda_{n-1}, 0\}$, and $|\rho_{neg}| \le \varepsilon$. Note that
in this case $1 - \rho$ is negative, which yields $|1 - \rho| \le |\rho_2| + \varepsilon$. Now
the optimum $A, B$ is obtained by minimizing the 2-variance of the coordinates of the
transformed eigenvector $D^{-1/2} \mathbf{u}_2$ (now $\mathbf{u}_2$ belongs to $\lambda_n$ and
$\rho_2$ at the same time) for which the following relation - like (4) - can be proved:

$$S_2^2(D^{-1/2} \mathbf{u}_2) = O\Big(\frac{2 - \lambda_n}{2 - \lambda_{n-1}}\Big) = O\Big(\frac{\beta_n + 1}{\beta_{n-1} + 1}\Big) = O\Big(\frac{1 - \theta}{1 - \varepsilon}\Big), \tag{16}$$

where $\beta$'s are eigenvalues of the normalized modularity matrix. Indeed, in lack of dominant
vertices, there is a relation between the largest and smallest normalized Laplacian eigenvalues
of $G$ and $\overline{G}$, respectively, where the complement graph
$\overline{G} = (V, \overline{W})$ is defined such that $\overline{w}_{ij} = 1 - w_{ij}$
($i \ne j$) and $\overline{w}_{ii} = 0$ ($i = 1, \dots, n$).

If the two largest absolute value eigenvalues of the normalized modularity matrix are of
different sign, then we are able to find a gap at least $\theta - \varepsilon$ between
eigenvalues of the same sign.

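Therefore, (12) can be estimated from above with

$$|\rho_2| \cdot |x_1 y_1 + x_2 y_2| + \varepsilon x_1 y_1 + \max_{i \ge 3} |\rho_i| \cdot \Big| \sum_{i=3}^{n} x_i y_i \Big|. \tag{17}$$

As for the second term, $\varepsilon x_1 y_1 = \varepsilon \mathrm{Vol}(X) \mathrm{Vol}(Y)$, so
it does not need further treatment. Using the Cauchy-Schwarz inequality, the last term can be
estimated from above with

$$\varepsilon \sqrt{\sum_{i=3}^{n} x_i^2 \sum_{i=3}^{n} y_i^2} \le \varepsilon \sqrt{\mathrm{Vol}(X) (1 - \mathrm{Vol}(X)) \mathrm{Vol}(Y) (1 - \mathrm{Vol}(Y))} \le \varepsilon \sqrt{\mathrm{Vol}(X) \mathrm{Vol}(Y)},$$

since $x_1 = \mathrm{Vol}(X)$, $y_1 = \mathrm{Vol}(Y)$ and
$\sum_{i=1}^{n} x_i^2 = \|\mathbf{x}\|^2 = \mathrm{Vol}(X)$,
$\sum_{i=1}^{n} y_i^2 = \|\mathbf{y}\|^2 = \mathrm{Vol}(Y)$.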

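The first term is reminiscent of an equation for the coordinates of orthogonal vectors.
Therefore, we project the vectors $\mathbf{u}_1, \mathbf{u}_2$ onto the subspace
$F = \mathrm{Span}\{\mathbf{a}, \mathbf{b}\}$. In fact, $\mathbf{u}_1 = \mathbf{a} + \mathbf{b}$,
and hence, $\mathbf{u}_1 \in F$. The vector $\mathbf{u}_2$ can be decomposed as

$$\mathbf{u}_2 = \frac{\mathbf{u}_2^T \mathbf{a}}{\mathrm{Vol}(A)} \mathbf{a} + \frac{\mathbf{u}_2^T \mathbf{b}}{\mathrm{Vol}(B)} \mathbf{b} + \mathbf{q}, \tag{18}$$

where $\mathbf{q}$ is the component orthogonal to $F$. For the squared distance
$\|\mathbf{q}\|^2$ between $\mathbf{u}_2$ and $F$, in [4], we proved that it is equal to the
weighted 2-variance $S_2^2(D^{-1/2} \mathbf{u}_2)$ and in (15) we estimated it from above with
$\frac{1 - \theta}{1 - \varepsilon}$. (In the $\rho_2 = 1 - \lambda_n$ case a similar upper
estimation works using (16)). Let $s^2$ denote this minimum 2-variance of the coordinates of
$D^{-1/2} \mathbf{u}_2$ (in both cases).

To estimate
$a_1 b_1 + a_2 b_2 = (\mathbf{u}_1^T \mathbf{a})(\mathbf{u}_1^T \mathbf{b}) + (\mathbf{u}_2^T \mathbf{a})(\mathbf{u}_2^T \mathbf{b})$,
the problem is that the pairwise orthogonal vectors $\mathbf{u}_1, \mathbf{u}_2$ and
$\mathbf{a}, \mathbf{b}$ are not in the same subspace of $\mathbb{R}^n$ as, in general,
$\mathbf{u}_2 \notin F$. However, by an argument proved in [4], we can find orthogonal,
unit-norm vectors $\tilde{\mathbf{u}}_1, \tilde{\mathbf{u}}_2 \in F$ such that

$$\|\mathbf{u}_1 - \tilde{\mathbf{u}}_1\|^2 + \|\mathbf{u}_2 - \tilde{\mathbf{u}}_2\|^2 \le 2 s^2, \tag{19}$$

where, in view of $\mathbf{u}_1 \in F$, $\tilde{\mathbf{u}}_1 = \mathbf{u}_1$. Let
$\mathbf{r} := \mathbf{u}_2 - \tilde{\mathbf{u}}_2$. Since
$\tilde{\mathbf{u}}_1^T \mathbf{a}, \tilde{\mathbf{u}}_2^T \mathbf{a}$ and
$\tilde{\mathbf{u}}_1^T \mathbf{b}, \tilde{\mathbf{u}}_2^T \mathbf{b}$ are coordinates of the
orthogonal vectors $\mathbf{a}, \mathbf{b}$ in the basis
$\tilde{\mathbf{u}}_1, \tilde{\mathbf{u}}_2$,

$$(\tilde{\mathbf{u}}_1^T \mathbf{a})(\tilde{\mathbf{u}}_1^T \mathbf{b}) + (\tilde{\mathbf{u}}_2^T \mathbf{a})(\tilde{\mathbf{u}}_2^T \mathbf{b}) = 0,$$

and because of
$\tilde{\mathbf{u}}_2^T \mathbf{a} + \tilde{\mathbf{u}}_2^T \mathbf{b} = \tilde{\mathbf{u}}_2^T \mathbf{u}_1 = 0$,

$$\tilde{\mathbf{u}}_2^T \mathbf{a} = -\tilde{\mathbf{u}}_2^T \mathbf{b} = \sqrt{\mathrm{Vol}(A) \mathrm{Vol}(B)} =: c.$$

Therefore,

$$|(\mathbf{u}_1^T \mathbf{a})(\mathbf{u}_1^T \mathbf{b}) + (\mathbf{u}_2^T \mathbf{a})(\mathbf{u}_2^T \mathbf{b})| = |\mathrm{Vol}(A) \mathrm{Vol}(B) + [(\tilde{\mathbf{u}}_2 + \mathbf{r})^T \mathbf{a}][(\tilde{\mathbf{u}}_2 + \mathbf{r})^T \mathbf{b}]|$$

$$= |\mathrm{Vol}(A) \mathrm{Vol}(B) + [c + \mathbf{r}^T \mathbf{a}][-c + \mathbf{r}^T \mathbf{b}]| = |c(-\mathbf{r}^T \mathbf{a} + \mathbf{r}^T \mathbf{b}) + (\mathbf{r}^T \mathbf{a})(\mathbf{r}^T \mathbf{b})|$$

$$\le |c| \sqrt{\|\mathbf{r}\|^2 \|\mathbf{b} - \mathbf{a}\|^2} + \sqrt{\|\mathbf{r}\|^2 \|\mathbf{a}\|^2} \sqrt{\|\mathbf{r}\|^2 \|\mathbf{b}\|^2}$$

$$\le \sqrt{\mathrm{Vol}(A) \mathrm{Vol}(B)} (\|\mathbf{r}\| + \|\mathbf{r}\|^2) \le \sqrt{\mathrm{Vol}(A) \mathrm{Vol}(B)} (\sqrt{2} s + 2 s^2),$$

using (19) and the fact that $\|\mathbf{b} - \mathbf{a}\|^2 = 1$.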

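Now we estimate
$x_1 y_1 + x_2 y_2 = (\mathbf{u}_1^T \mathbf{x})(\mathbf{u}_1^T \mathbf{y}) + (\mathbf{u}_2^T \mathbf{x})(\mathbf{u}_2^T \mathbf{y})$.
Going back to (18) we have

$$\mathbf{u}_2^T \mathbf{x} = \frac{\mathbf{u}_2^T \mathbf{a}}{\mathrm{Vol}(A)} \mathbf{a}^T \mathbf{x} + \frac{\mathbf{u}_2^T \mathbf{b}}{\mathrm{Vol}(B)} \mathbf{b}^T \mathbf{x} + \mathbf{q}^T \mathbf{x} = \frac{\mathrm{Vol}(X)}{\mathrm{Vol}(A)} \mathbf{u}_2^T \mathbf{a} + \mathbf{q}^T \mathbf{x},$$

and similarly,

$$\mathbf{u}_2^T \mathbf{y} = \frac{\mathbf{u}_2^T \mathbf{a}}{\mathrm{Vol}(A)} \mathbf{a}^T \mathbf{y} + \frac{\mathbf{u}_2^T \mathbf{b}}{\mathrm{Vol}(B)} \mathbf{b}^T \mathbf{y} + \mathbf{q}^T \mathbf{y} = \frac{\mathrm{Vol}(Y)}{\mathrm{Vol}(B)} \mathbf{u}_2^T \mathbf{b} + \mathbf{q}^T \mathbf{y},$$

that in view of $\|\mathbf{q}\|^2 = s^2$ yields

$$|x_1 y_1 + x_2 y_2| = |(\mathbf{u}_1^T \mathbf{x})(\mathbf{u}_1^T \mathbf{y}) + (\mathbf{u}_2^T \mathbf{x})(\mathbf{u}_2^T \mathbf{y})|$$

$$= \Big| \mathrm{Vol}(X) \mathrm{Vol}(Y) + \Big( \frac{\mathrm{Vol}(X)}{\mathrm{Vol}(A)} \mathbf{u}_2^T \mathbf{a} + \mathbf{q}^T \mathbf{x} \Big) \Big( \frac{\mathrm{Vol}(Y)}{\mathrm{Vol}(B)} \mathbf{u}_2^T \mathbf{b} + \mathbf{q}^T \mathbf{y} \Big) \Big|$$

$$\le \frac{\mathrm{Vol}(X) \mathrm{Vol}(Y)}{\mathrm{Vol}(A) \mathrm{Vol}(B)} \big| \mathrm{Vol}(A) \mathrm{Vol}(B) + (\mathbf{u}_2^T \mathbf{a})(\mathbf{u}_2^T \mathbf{b}) \big| + \|\mathbf{q}\| \|\mathbf{x}\| \frac{\mathrm{Vol}(Y)}{\mathrm{Vol}(B)} \|\mathbf{u}_2\| \|\mathbf{b}\| + \|\mathbf{q}\| \|\mathbf{y}\| \frac{\mathrm{Vol}(X)}{\mathrm{Vol}(A)} \|\mathbf{u}_2\| \|\mathbf{a}\| + \|\mathbf{q}\|^2 \|\mathbf{x}\| \|\mathbf{y}\|$$

$$\le \sqrt{\mathrm{Vol}(A) \mathrm{Vol}(B)} (\sqrt{2} s + 2 s^2) + \|\mathbf{q}\| \sqrt{\mathrm{Vol}(X)} \sqrt{\mathrm{Vol}(Y)} \Big( \frac{\sqrt{\mathrm{Vol}(Y)}}{\sqrt{\mathrm{Vol}(B)}} + \frac{\sqrt{\mathrm{Vol}(X)}}{\sqrt{\mathrm{Vol}(A)}} + \|\mathbf{q}\| \Big)$$

$$\le \sqrt{\mathrm{Vol}(A) \mathrm{Vol}(B)} [(\sqrt{2} s + 2 s^2) + s(2 + s)] = \sqrt{\mathrm{Vol}(A) \mathrm{Vol}(B)} [(\sqrt{2} + 2) s + 3 s^2]$$

$$\le \sqrt{\mathrm{Vol}(A) \mathrm{Vol}(B)} (\sqrt{2} + 5) s.$$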

Summarizing, the second and third terms in (17) are estimated from above with
$\varepsilon \sqrt{\mathrm{Vol}(X) \mathrm{Vol}(Y)} \le \varepsilon \sqrt{\mathrm{Vol}(A) \mathrm{Vol}(B)}$.
Because of $\varepsilon < \theta$, by an easy calculation it follows that it is less than
$\sqrt{\frac{1 - \theta}{1 - \varepsilon}}$. Therefore, the constant $\alpha$ of the $(A, B)$
pair's regularity is $O\big(\sqrt{\tfrac{1-\theta}{1-\varepsilon}}\big)$.



Remark 1 The statement has relevance only if $\theta$ is much larger than $\varepsilon$. In this
case the spectral gap between the largest absolute value eigenvalue and the others in the
normalized modularity spectrum indicates a regular 2-partition of the graph that can be
constructed based on the eigenvector belonging to the structural eigenvalue.
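
The construction of Theorem 1 can be sketched in a few lines. The code below assumes NumPy and
uses the standard fact that an optimal weighted 2-means partition of points on the real line
is a threshold split of the sorted coordinates, so a scan over $n - 1$ thresholds suffices. The
function name and the two-block example weights are hypothetical.

```python
import numpy as np

def spectral_bipartition(W):
    """2-partition minimizing the weighted 2-variance of D^{-1/2} u_2 (1-D scan)."""
    W = W / W.sum()
    d = W.sum(axis=1)
    s = np.sqrt(d)
    rho, U = np.linalg.eigh((W / s[:, None]) / s[None, :])
    order = np.argsort(-np.abs(rho))          # decreasing absolute value
    x = U[:, order[1]] / s                    # coordinates of D^{-1/2} u_2
    idx = np.argsort(x)
    best, cut = np.inf, None
    for m in range(1, len(x)):                # threshold between idx[m-1] and idx[m]
        s2 = 0.0
        for part in (idx[:m], idx[m:]):
            center = np.average(x[part], weights=d[part])
            s2 += (d[part] * (x[part] - center) ** 2).sum()
        if s2 < best:
            best, cut = s2, m
    labels = np.zeros(len(x), dtype=int)
    labels[idx[cut:]] = 1
    return labels, best, rho[order]

# Hypothetical example: two blocks of 4 vertices, weight 1 within and 0.1 between.
n = 8
block = np.repeat([0, 1], 4)
W = np.where(block[:, None] == block[None, :], 1.0, 0.1)
np.fill_diagonal(W, 0.0)

labels, s2, rho = spectral_bipartition(W)
theta, eps = abs(rho[1]), abs(rho[2])         # gap between structural eigenvalue and bulk
```

In this idealized example the transformed eigenvector is constant on each block, so the
minimum 2-variance is numerically zero and the two blocks are recovered.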

4 Analysis of Variance setup (the k > 2 case) 

Theorem 2 Let $G = (V, W)$ be an edge-weighted graph on $n$ vertices, with generalized degrees
$d_1, \dots, d_n$ and $D = \mathrm{diag}(d_1, \dots, d_n)$. Suppose that $\mathrm{Vol}(V) = 1$.
Let the eigenvalues of $D^{-1/2} W D^{-1/2}$, enumerated in decreasing absolute values, be

$$1 = \rho_1 > |\rho_2| \ge \dots \ge |\rho_k| > \varepsilon \ge |\rho_i|, \qquad i \ge k + 1.$$

The partition $(V_1, \dots, V_k)$ of $V$ is defined so that it minimizes the weighted
$k$-variance of the vertex representatives obtained as row vectors of the $n \times k$ matrix
$X$ of column vectors $D^{-1/2} \mathbf{u}_i$, where $\mathbf{u}_i$ is the unit-norm eigenvector
belonging to $\rho_i$ ($i = 1, \dots, k$). With the notation $s^2 = S_k^2(X)$, the $(V_i, V_j)$
pairs are $2(\sqrt{2} s + \varepsilon)$-volume regular ($i \ne j$) and for the clusters $V_i$
($i = 1, \dots, k$) the following holds: for all $X, Y \subset V_i$ we have that

$$|w(X, Y) - \rho(V_i) \mathrm{Vol}(X) \mathrm{Vol}(Y)| \le 2(\sqrt{2} s + \varepsilon) \mathrm{Vol}(V_i), \tag{20}$$

where $\rho(V_i) = \frac{w(V_i, V_i)}{\mathrm{Vol}^2(V_i)}$ is the relative intra-cluster density
of $V_i$.

Proof Denoting by $\mathbf{u}_1, \dots, \mathbf{u}_k$ the eigenvectors belonging to the
so-called structural eigenvalues $\rho_1, \dots, \rho_k$, the representatives
$\mathbf{r}_1, \dots, \mathbf{r}_n$ of the vertices are row vectors of the matrix
$X = (\mathbf{x}_1, \dots, \mathbf{x}_k)$, where $\mathbf{x}_i = D^{-1/2} \mathbf{u}_i$
($i = 1, \dots, k$) and the trivial $\mathbf{x}_1 = \mathbf{1}$ (belonging to $\rho_1 = 1$) can
be omitted, see (5). The minimum $k$-variance $S_k^2(X)$ of the $k$-dimensional (actually,
$(k-1)$-dimensional) representatives is as small as $s^2$. Suppose that the minimum $k$-variance
is attained by the $k$-partition $(V_1, \dots, V_k)$ of the vertices.
By an easy analysis of variance argument of [5, 6] it follows that

$$s^2 = \sum_{i=1}^{k} \mathrm{dist}^2(\mathbf{u}_i, F),$$

where $F = \mathrm{Span}\{D^{1/2} \mathbf{z}_1, \dots, D^{1/2} \mathbf{z}_k\}$ with the so-called
normalized partition vectors $\mathbf{z}_1, \dots, \mathbf{z}_k$ of coordinates
$z_{ji} = \frac{1}{\sqrt{\mathrm{Vol}(V_i)}}$ if $j \in V_i$ and 0, otherwise
($i = 1, \dots, k$). Note that the vectors $D^{1/2} \mathbf{z}_1, \dots, D^{1/2} \mathbf{z}_k$
form an orthonormal system. By [4, 5] we can find another orthonormal system
$\mathbf{v}_1, \dots, \mathbf{v}_k \in F$ such that

$$\sum_{i=1}^{k} \|\mathbf{u}_i - \mathbf{v}_i\|^2 \le 2 s^2.$$
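
The analysis of variance identity between the weighted $k$-variance and the squared distances
from the subspace $F$ holds for any fixed partition, not only for the minimizing one, and can be
checked numerically. The sketch below assumes NumPy, a hypothetical random weight matrix and an
arbitrary 3-partition chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical weighted graph and partition.
n, k = 9, 3
M = rng.random((n, n))
W = np.triu(M, 1)
W = W + W.T
W /= W.sum()
d = W.sum(axis=1)
s = np.sqrt(d)
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

# Eigenvectors of D^{-1/2} W D^{-1/2} for the k largest absolute value eigenvalues.
rho, U = np.linalg.eigh((W / s[:, None]) / s[None, :])
Uk = U[:, np.argsort(-np.abs(rho))[:k]]
X = Uk / s[:, None]                            # vertex representatives (rows)

# Weighted k-variance of the representatives for this partition.
s2 = 0.0
for a in range(k):
    idx = labels == a
    center = (d[idx, None] * X[idx]).sum(axis=0) / d[idx].sum()
    s2 += (d[idx] * ((X[idx] - center) ** 2).sum(axis=1)).sum()

# Orthonormal basis D^{1/2} z_1, ..., D^{1/2} z_k of the subspace F.
Z = np.zeros((n, k))
for a in range(k):
    idx = labels == a
    Z[idx, a] = 1.0 / np.sqrt(d[idx].sum())
F_basis = s[:, None] * Z

# Sum of squared distances of u_1, ..., u_k from F.
residual = Uk - F_basis @ (F_basis.T @ Uk)
dist2 = (residual ** 2).sum()
```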
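With these vectors, we construct the following $k$-rank approximation of the matrix
$D^{-1/2} W D^{-1/2} = \sum_{i=1}^{n} \rho_i \mathbf{u}_i \mathbf{u}_i^T$: it is approximated by
$\sum_{i=1}^{k} \rho_i \mathbf{v}_i \mathbf{v}_i^T$ with the following accuracy (in spectral
norm):

$$\Big\| \sum_{i=1}^{n} \rho_i \mathbf{u}_i \mathbf{u}_i^T - \sum_{i=1}^{k} \rho_i \mathbf{v}_i \mathbf{v}_i^T \Big\| \le \sum_{i=1}^{k} |\rho_i| \cdot \|\mathbf{u}_i \mathbf{u}_i^T - \mathbf{v}_i \mathbf{v}_i^T\| + \Big\| \sum_{i=k+1}^{n} \rho_i \mathbf{u}_i \mathbf{u}_i^T \Big\| \le \sqrt{\sum_{i=1}^{k} \sin^2 \alpha_i} + \varepsilon \le \sqrt{2} s + \varepsilon, \tag{21}$$

where $\alpha_i$ is the angle between $\mathbf{u}_i$ and $\mathbf{v}_i$, and for it,
$\sin \frac{\alpha_i}{2} = \frac{1}{2} \|\mathbf{u}_i - \mathbf{v}_i\|$ holds, therefore

$$\sin^2 \alpha_i = \Big( 2 \sin \frac{\alpha_i}{2} \cos \frac{\alpha_i}{2} \Big)^2 = \frac{1}{4} \|\mathbf{u}_i - \mathbf{v}_i\|^2 (4 - \|\mathbf{u}_i - \mathbf{v}_i\|^2), \qquad i = 1, \dots, k.$$

Hence, the above difference can be estimated from above with $\sqrt{2} s + \varepsilon$ in
spectral norm.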
Based on these considerations and the fact that the cut norm is less than or equal to the
spectral norm, the densities to be estimated in the defining formula (10) of volume regularity
can be written in terms of stepwise constant vectors in the following way. The vectors
$\mathbf{y}_i := D^{-1/2} \mathbf{v}_i$ are stepwise constants on the partition
$(V_1, \dots, V_k)$, $i = 1, \dots, k$. The matrix
$\sum_{i=1}^{k} \rho_i \mathbf{y}_i \mathbf{y}_i^T$ is therefore a symmetric block-matrix on
$k \times k$ blocks belonging to the above partition of the vertices. Let $\hat{w}_{ab}$ denote
its entries in the $(a, b)$ block ($a, b = 1, \dots, k$). Using (21), the following
approximation of the matrix $W$ is performed:

$$\Big\| W - D \Big( \sum_{i=1}^{k} \rho_i \mathbf{y}_i \mathbf{y}_i^T \Big) D \Big\| = \Big\| D^{1/2} \Big( D^{-1/2} W D^{-1/2} - \sum_{i=1}^{k} \rho_i \mathbf{v}_i \mathbf{v}_i^T \Big) D^{1/2} \Big\| \le \|D\|^{1/2} (\sqrt{2} s + \varepsilon) \|D\|^{1/2}.$$


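Therefore, the entries of $W$ - for $i \in V_a$, $j \in V_b$ - can be decomposed as

$$w_{ij} = d_i d_j \hat{w}_{ab} + \eta_{ij},$$

where the cut norm and spectral norm of the $n \times n$ symmetric error matrix
$E = (\eta_{ij})$ is at most $\|D\| (\sqrt{2} s + \varepsilon)$. But we will restrict the error
matrix to $V_a \times V_b$: its entries are $\eta_{ij}$'s for $i \in V_a$, $j \in V_b$, and
zeros otherwise. Denoting the restricted matrix by $E^{ab}$, and the restricted diagonal
matrices by $D^a$ and $D^b$, respectively, the following finer estimation holds:

$$\|E^{ab}\|_{\square} \le \|D^a\|^{1/2} \cdot \|D^b\|^{1/2} \cdot (\sqrt{2} s + \varepsilon) \le \sqrt{\mathrm{Vol}(V_a) \mathrm{Vol}(V_b)} (\sqrt{2} s + \varepsilon).$$

Consequently, for $a, b = 1, \dots, k$:

$$|w(X, Y) - \rho(V_a, V_b) \mathrm{Vol}(X) \mathrm{Vol}(Y)| \le (\sqrt{2} s + \varepsilon) \Big( \sqrt{\mathrm{Vol}(V_a) \mathrm{Vol}(V_b)} + \frac{\mathrm{Vol}(X) \mathrm{Vol}(Y)}{\sqrt{\mathrm{Vol}(V_a) \mathrm{Vol}(V_b)}} \Big) \le 2(\sqrt{2} s + \varepsilon) \sqrt{\mathrm{Vol}(V_a) \mathrm{Vol}(V_b)},$$

that gives the required statement both in the $a \ne b$ and $a = b$ case.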

Remark 2 In the $k = 2$ case, the estimate of Theorem 1 has the same order of magnitude
as that of Theorem 2, since $s^2 = O\big(\tfrac{1-\theta}{1-\varepsilon}\big)$. The statement has
only relevance for an integer $k \in [2, n)$ such that there is a remarkable spectral gap
between $\theta := |\rho_k|$ and $|\rho_{k+1}|$ in the normalized modularity spectrum, i.e., the
so-called structural eigenvalues $\rho_1, \dots, \rho_k$ are far apart from zero, while the
others are in an $\varepsilon$ distance from zero, in absolute value. This is a necessary
condition for $s^2$ to be "small". As it is not sufficient, instead of $\theta$ and
$\varepsilon$, the estimation of Theorem 2 is given in terms of $s$ and $\varepsilon$. Indeed,
by perturbation results of spectral subspaces for symmetric matrices [3], $s^2$ itself can be
estimated from above by the spectral gap between the $k$ structural and the other eigenvalues
when $\rho_2, \dots, \rho_k$ have the same sign (the situation of strong community or
anti-community structure).
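
The clustering step of Theorem 2 can be sketched with a weighted version of Lloyd's $k$-means
iteration applied to the vertex representatives. This is only a heuristic sketch: Lloyd's
iteration is a swapped-in local search and is not guaranteed to reach the minimum $k$-variance
required by the theorem. NumPy, the function names, the farthest-point initialization and the
block-structured example weights are all assumptions made for the illustration.

```python
import numpy as np

def spectral_representatives(W, k):
    """Rows of (D^{-1/2} u_1, ..., D^{-1/2} u_k) for the k largest |rho_i|."""
    W = W / W.sum()
    d = W.sum(axis=1)
    s = np.sqrt(d)
    rho, U = np.linalg.eigh((W / s[:, None]) / s[None, :])
    order = np.argsort(-np.abs(rho))[:k]
    return d, U[:, order] / s[:, None], rho[order]

def weighted_kmeans(X, d, k, n_iter=100):
    """Lloyd iteration with degree weights and farthest-point initialization."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    labels = None
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for a in range(k):
            idx = labels == a
            if idx.any():                      # weighted center of cluster a
                centers[a] = (d[idx, None] * X[idx]).sum(axis=0) / d[idx].sum()
    return labels

# Hypothetical example: 3 blocks of 5 vertices, within-block weights 1.0, 0.8, 0.6
# and between-block weight 0.1, zero diagonal.
block = np.repeat([0, 1, 2], 5)
within = np.array([1.0, 0.8, 0.6])
W = np.where(block[:, None] == block[None, :], within[block][:, None], 0.1)
np.fill_diagonal(W, 0.0)

d, X, rho = spectral_representatives(W, 3)
labels = weighted_kmeans(X, d, 3)
```

In this idealized example the representatives are constant on the blocks, so the three blocks
are recovered exactly.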